Model Selection

Multi-speaker support

# Multi-speaker support

Kartoffel Orpheus 3B German Natural V0.1

A German text-to-speech (TTS) model based on Orpheus-3B, primarily fine-tuned on natural human speech recordings, aiming to achieve realistic sound.

Speech Synthesis

Transformers Supports Multiple Languages

Kartoffel Orpheus 3B German Synthetic V0.1

A German text-to-speech (TTS) model based on Orpheus-3B, supporting multiple speakers and emotional expression.

Speech Synthesis

Transformers Supports Multiple Languages

Sauerkrauttts Preview 0.1 Q4 K M GGUF

SauerkrautTTS-Preview-0.1 is a German text-to-speech (TTS) model fine-tuned from a powerful base model, offering four unique German speaker voices.

Speech Synthesis German

Kokoro 82M V1.1 Zh

Kokoro is an open-weight series of small yet powerful text-to-speech (TTS) models, now featuring data from 100 Chinese speakers sourced from professional datasets.

Speech Synthesis

YarnGPT is a text-to-speech (TTS) model specifically designed for synthesizing Nigerian-accented English, utilizing pure language modeling technology to deliver high-quality, natural, and culturally relevant speech synthesis for diverse applications.

Speech Synthesis

Transformers English

Hindi Text To Speech Tts

Hindi text-to-speech model fine-tuned based on microsoft/speecht5_tts

Speech Synthesis

F5-TTS is a fully non-autoregressive zero-shot text-to-speech model that supports high-quality speech synthesis.

Speech Synthesis

Tts Ru Free Hf Vits Low Multispeaker

A Russian text-to-speech model supporting multiple speakers, capable of directly processing plain text with punctuation without prior conversion to phonemes.

Speech Synthesis

Transformers Other

Speecht5 Tts Arabic

Arabic text-to-speech model fine-tuned on Microsoft's SpeechT5 architecture, trained on the Hakawati dataset

Speech Synthesis

Transformers Arabic

Matxa Tts Cat Multispeaker

A Catalan multi-speaker text-to-speech model based on Matcha-TTS architecture, trained with optimal transport conditional flow matching for fast and high-quality speech synthesis

Speech Synthesis Other

This is a Russian text-to-speech model based on the VITS architecture, capable of converting Russian text into natural speech.

Speech Synthesis

Transformers Other

VITS is an end-to-end speech synthesis model capable of predicting corresponding speech waveforms from input text sequences. The model employs a conditional variational autoencoder (VAE) architecture, including a posterior encoder, decoder, and conditional prior module.

Speech Synthesis

kakao-enterprise

VITS is an end-to-end speech synthesis model capable of predicting corresponding speech waveforms from input text sequences.

Speech Synthesis

kakao-enterprise

Nvidia Tts En Hifitts Hifigan Ft Fastpitch

HiFiGAN is a GAN-based vocoder model capable of generating high-quality audio from mel-spectrograms, supporting multi-speaker English voice synthesis.

Speech Synthesis English

Mastering-Python-HF

Speecht5 Tts Common Voice 5 Sv

A Swedish text-to-speech model fine-tuned based on Microsoft's SpeechT5 architecture, trained using the Common Voice dataset

Speech Synthesis

Transformers Other

Kan Bayashi Libritts Xvector Vits

A text-to-speech model trained using the ESPnet framework, trained on the LibriTTS dataset, supporting English speech synthesis.

Speech Synthesis English

Featured Recommended AI Models

AIbase

Empowering the Future, Your AI Solution Knowledge Base

English 简体中文繁體中文にほんご

© 2025AIbase